OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

OmniPage vs. Sakhr: Paired model evaluation of two Arabic OCR products

Internal identifier: 002038 (Main/Exploration); previous: 002037; next: 002039

Authors: T. Kanungo [United States]; G. A. Marton [United States]; O. Bulbul [United States]

RBID : Pascal:99-0297244

Abstract

Characterizing the performance of Optical Character Recognition (OCR) systems is crucial for monitoring technical progress, predicting OCR performance, providing scientific explanations for system behavior, and identifying open problems. While past research has compared the performance of two or more OCR systems, all of these studies assume that the accuracies achieved on individual documents in a dataset are independent when, in fact, they are not. In this paper we show that accuracies reported on any dataset are correlated and invoke the appropriate statistical technique, the paired model, to compare the accuracies of two recognition systems. We show theoretically that this method provides tighter confidence intervals than the methods used in the OCR and computer vision literature. We also propose a new visualization method, which we call the accuracy scatter plot, for providing a visual summary of performance results. This method summarizes the accuracy comparisons on the entire corpus while simultaneously allowing the researcher to visually compare the performances on individual document images. Finally, we report on accuracy and speed as a function of scanning resolution. Contrary to what one might expect, the performance of one of the systems degrades when the image resolution is increased beyond 300 dpi. Furthermore, the average time taken to OCR a document image, after increasing almost linearly as a function of resolution, suddenly becomes constant beyond 400 dpi. This behavior most likely arises because the OCR algorithm samples images at resolutions of 400 dpi and higher to a standard resolution. The two products that we compare are the Arabic OmniPage 2.0 and the Automatic Page Reader 3.01 from Sakhr. The SAIC Arabic dataset was used for the evaluations. The statistical and visualization methods presented in this article are very general and can be used for comparing the accuracies of any two recognition systems, not just OCR systems.
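The abstract's claim that the paired model yields tighter confidence intervals can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `paired_vs_unpaired_ci`, the sample accuracies, and the use of a normal-approximation multiplier `z = 1.96` (instead of a Student t quantile, which would be more appropriate for small samples) are all assumptions made for illustration. The sketch compares the half-width of an interval built from per-document differences with one that treats the two accuracy lists as independent samples.

```python
import math
import statistics


def paired_vs_unpaired_ci(acc_a, acc_b, z=1.96):
    """Return (mean_diff, paired_half_width, unpaired_half_width).

    acc_a and acc_b hold the accuracies of two systems on the SAME
    documents, in the same order. z is a normal-approximation
    multiplier (an assumption; a t quantile suits small samples better).
    """
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(acc_a)

    # Paired model: work on the per-document differences.
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    mean_diff = statistics.mean(diffs)
    se_paired = statistics.stdev(diffs) / math.sqrt(n)

    # Independence assumption: variances of the two samples simply add.
    se_unpaired = math.sqrt(
        (statistics.variance(acc_a) + statistics.variance(acc_b)) / n
    )
    return mean_diff, z * se_paired, z * se_unpaired


# Hypothetical per-document accuracies, not taken from the paper.
system_a = [0.90, 0.80, 0.70, 0.95, 0.60]
system_b = [0.85, 0.76, 0.66, 0.90, 0.55]

mean_diff, paired_hw, unpaired_hw = paired_vs_unpaired_ci(system_a, system_b)
print(f"mean difference: {mean_diff:.3f}")
print(f"paired half-width: {paired_hw:.3f}")
print(f"unpaired half-width: {unpaired_hw:.3f}")
```

Because the variance of the differences equals Var(A) + Var(B) - 2 Cov(A, B), the paired interval is narrower whenever the two systems' accuracies are positively correlated across documents, as they are in this made-up sample where both systems struggle on the same pages.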


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">OmniPage vs. sakhr : Paired model evaluation of two Arabic OCR products</title>
<author>
<name sortKey="Kanungo, T" sort="Kanungo, T" uniqKey="Kanungo T" first="T." last="Kanungo">T. Kanungo</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Marton, G A" sort="Marton, G A" uniqKey="Marton G" first="G. A." last="Marton">G. A. Marton</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Bulbul, O" sort="Bulbul, O" uniqKey="Bulbul O" first="O." last="Bulbul">O. Bulbul</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">99-0297244</idno>
<date when="1999">1999</date>
<idno type="stanalyst">PASCAL 99-0297244 INIST</idno>
<idno type="RBID">Pascal:99-0297244</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000825</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000B69</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000768</idno>
<idno type="wicri:doubleKey">1017-2653:1999:Kanungo T:omnipage:vs:sakhr</idno>
<idno type="wicri:Area/Main/Merge">002149</idno>
<idno type="wicri:Area/Main/Curation">002038</idno>
<idno type="wicri:Area/Main/Exploration">002038</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">OmniPage vs. sakhr : Paired model evaluation of two Arabic OCR products</title>
<author>
<name sortKey="Kanungo, T" sort="Kanungo, T" uniqKey="Kanungo T" first="T." last="Kanungo">T. Kanungo</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Marton, G A" sort="Marton, G A" uniqKey="Marton G" first="G. A." last="Marton">G. A. Marton</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Bulbul, O" sort="Bulbul, O" uniqKey="Bulbul O" first="O." last="Bulbul">O. Bulbul</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Center for Automation Research, University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="1999">1999</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Document analysis</term>
<term>Document image processing</term>
<term>Document retrieval</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traitement image document</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance forme</term>
<term>Evaluation performance</term>
<term>Recherche documentaire</term>
<term>Analyse documentaire</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Recherche documentaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Characterizing the performance of Optical Character Recognition (OCR) systems is crucial for monitoring technical progress, predicting OCR performance, providing scientific explanations for system behavior, and identifying open problems. While past research has compared the performance of two or more OCR systems, all of these studies assume that the accuracies achieved on individual documents in a dataset are independent when, in fact, they are not. In this paper we show that accuracies reported on any dataset are correlated and invoke the appropriate statistical technique, the paired model, to compare the accuracies of two recognition systems. We show theoretically that this method provides tighter confidence intervals than the methods used in the OCR and computer vision literature. We also propose a new visualization method, which we call the accuracy scatter plot, for providing a visual summary of performance results. This method summarizes the accuracy comparisons on the entire corpus while simultaneously allowing the researcher to visually compare the performances on individual document images. Finally, we report on accuracy and speed as a function of scanning resolution. Contrary to what one might expect, the performance of one of the systems degrades when the image resolution is increased beyond 300 dpi. Furthermore, the average time taken to OCR a document image, after increasing almost linearly as a function of resolution, suddenly becomes constant beyond 400 dpi. This behavior most likely arises because the OCR algorithm samples images at resolutions of 400 dpi and higher to a standard resolution. The two products that we compare are the Arabic OmniPage 2.0 and the Automatic Page Reader 3.01 from Sakhr. The SAIC Arabic dataset was used for the evaluations.
The statistical and visualization methods presented in this article are very general and can be used for comparing the accuracies of any two recognition systems, not just OCR systems.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Maryland</li>
</region>
<settlement>
<li>College Park (Maryland)</li>
</settlement>
<orgName>
<li>Université du Maryland</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Maryland">
<name sortKey="Kanungo, T" sort="Kanungo, T" uniqKey="Kanungo T" first="T." last="Kanungo">T. Kanungo</name>
</region>
<name sortKey="Bulbul, O" sort="Bulbul, O" uniqKey="Bulbul O" first="O." last="Bulbul">O. Bulbul</name>
<name sortKey="Marton, G A" sort="Marton, G A" uniqKey="Marton G" first="G. A." last="Marton">G. A. Marton</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002038 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002038 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:99-0297244
   |texte=   OmniPage vs. sakhr : Paired model evaluation of two Arabic OCR products
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024